Distributed File Recovery on the Lustre Distributed File System

Authors

  • J. Albano
  • R. Seker
  • R. Babiceanu
  • S. Oral

Abstract

With the advancement of cloud-computing technologies and the growth in distributed software applications, a great deal of research has focused on the concepts and implementations of distributed file systems to support these applications. Since its inception in 1999 by Peter Braam at Carnegie Mellon University, the Lustre distributed file system has attracted both the technical and the financial interest of some of the largest technology entities, including Oracle, Seagate, Intel, Oak Ridge National Laboratory, and OpenSFS. With this immense backing, Lustre has been incorporated in over 60% of the TOP100 high performance computers in the world and is slated to significantly increase this market share. Although research on the Lustre file system itself has increased sharply since its infancy, support for many of the fields surrounding the file system has been greatly lacking. Primary among these deficiencies is file recovery on the Lustre file system. This paper attempts to fill this gap by providing a simplified solution, which is then developed into a distributed solution that can scale to meet the needs and requirements of Lustre file system deployments of various sizes. While this paper focuses on the Lustre file system, the concepts and solution it provides can be used on any similar metadata-based distributed file system. Although this paper does not provide an implementation of this solution, a complete solution architecture is provided, enabling further research and implementation.

Similar resources

Lustre: A Scalable, High-Performance File System

Today's network-oriented computing environments require high-performance, network-aware file systems that can satisfy both the data storage requirements of individual systems and the data sharing requirements of workgroups and clusters of cooperative systems. The Lustre File System, an open source, high-performance file system from Cluster File Systems, Inc., is a distributed file system that e...

Applicability of parallel file systems for technical simulations: a case study

The lack of balance in processor speed and input/output performance of modern computers presents a challenging problem in high performance computing. Parallel and distributed file systems are one of the solutions to this problem because of their ability to distribute the input/output load over the network and improve the performance of clusters in order to meet the demands of applications for t...

Personalized Cloud Storage System: A Combination of LDAP Distributed File System

As “cloud computing” gradually flourishes, distributed storage systems such as Gluster, Ceph, Lustre, and Hadoop are becoming increasingly diverse. In this paper, we propose a personal cloud storage system integrated with pNFS that can be accessed in parallel for scalable performance. In addition, data backup and failover mechanisms are designed. We expect that the function of the pro...

Distributed Lustre activity tracking

Numerous administration tools and techniques require a near-real-time view of the activity occurring on a distributed filesystem. The changelog facility provided by Lustre to address this need suffers from limitations in terms of scalability and flexibility. We have been working on reducing those limitations by enhancing Lustre itself and developing external tools such as Lustre ChangeLog Aggregate a...

Analysis of Long-Term File System Activities on Cluster Systems

I/O workload is a critical factor in analyzing I/O patterns and maximizing file system performance. However, measuring I/O workload on a running distributed parallel file system is non-trivial due to collection overhead and the large volume of data. In this paper, we measured and analyzed file system activities on two large-scale cluster systems which had TFlops level high performance com...

Publication date: 2016